Predictive modeling in case-control single-nucleotide polymorphism studies in the presence of population stratification: a case study using Genetic Analysis Workshop 16 Problem 1 dataset
Abstract
In this paper, we apply a gradient-boosting machine predictive model to the rheumatoid arthritis data to predict case-control status. The QQ-plot suggests severe population stratification. In univariate genome-wide association studies, a correction factor for ethnicity confounding can be derived. Here we propose a novel strategy for dealing with population stratification in the context of multivariate predictive modeling. We address the problem by clustering the subjects on the axes of genetic variation and building a predictive model separately in each cluster. This allows us to control for ethnicity without explicitly including it in the model, which could marginalize the genetic signal we are trying to discover. Clustering not only yields more ethnically homogeneous groups but also, as our results show, increases the accuracy of our model compared with the non-clustered approach. The highest accuracy is achieved with the model adjusted for population stratification, when the genetic axes of variation are included among the set of predictors, although this may be misleading given the confounding effects.
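The cluster-then-model strategy described in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the axes of genetic variation are taken as precomputed per-subject scores, k-means is an assumed choice of clustering method (the abstract does not name one), and `SingleSnpRule` is a hypothetical stand-in learner used only to keep the example free of dependencies; a gradient-boosting classifier would be passed as `make_model` instead. All function names and the toy data are invented for illustration.

```python
from collections import defaultdict


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with a deterministic start (assumed clustering choice)."""
    ordered = sorted(points)
    n = len(ordered)
    # Seed the centers with evenly spaced points from the sorted list.
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


class SingleSnpRule:
    """Hypothetical stand-in for a gradient-boosting model: one-SNP carrier rule."""

    def fit(self, X, y):
        best = None
        for j in range(len(X[0])):
            for flip in (0, 1):
                # Count subjects whose (possibly inverted) carrier status matches case status.
                hits = sum(((row[j] > 0) ^ flip) == bool(lab) for row, lab in zip(X, y))
                if best is None or hits > best[0]:
                    best = (hits, j, flip)
        _, self.snp, self.flip = best
        return self

    def predict(self, X):
        return [int((row[self.snp] > 0) ^ self.flip) for row in X]


def fit_per_cluster(X, y, labels, make_model):
    """Fit one independent model on the subjects of each cluster."""
    groups = defaultdict(lambda: ([], []))
    for row, lab, c in zip(X, y, labels):
        groups[c][0].append(row)
        groups[c][1].append(lab)
    return {c: make_model().fit(rows, ys) for c, (rows, ys) in groups.items()}


def predict_per_cluster(models, X, labels):
    """Route each subject to the model trained on that subject's cluster."""
    return [models[c].predict([row])[0] for row, c in zip(X, labels)]


# Invented toy data: two axes of variation, two SNPs (minor-allele counts), case status.
axes = [(-1.0, 0.1), (-0.9, -0.1), (-1.1, 0.0), (-1.0, -0.2),
        (1.0, 0.1), (0.9, -0.1), (1.1, 0.0), (1.0, 0.2)]
X = [(2, 0), (1, 1), (0, 1), (0, 0), (0, 2), (1, 1), (1, 0), (0, 0)]
y = [1, 1, 0, 0, 1, 1, 0, 0]

labels = kmeans(axes, k=2)
models = fit_per_cluster(X, y, labels, SingleSnpRule)
predictions = predict_per_cluster(models, X, labels)
```

The axes are used only to assign subjects to clusters and never enter the models as predictors, which mirrors the abstract's point about controlling for ethnicity without including it explicitly. Any learner exposing `fit` and `predict` in this shape could be substituted for the stand-in.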
Journal title:
Volume 3, issue
Pages -
Publication date: 2009